# Replication Materials for "Inaccurate Forecasting of a Randomized Controlled Trial"
Mats Ahrenshop, Miriam A. Golden, Saad Gulzar, and Luke Sonnet

April 2023

### R scripts

1. `clean_rawdata.R`

Takes as input the three files of anonymized raw forecasting data, `cerp_anon.csv`, `ucsd_anon.csv`, and `us_anon.csv`, then merges and cleans them to produce the dataset used in the analysis, `forecasting_clean.csv`.

2. `paper.R`

Takes as input (a) the main replication dataset, `forecasting_clean.csv`; (b) the dataset containing the pilot and main IVR estimates that were provided to forecasters, `ivr_mus.csv`; and (c) the dataset reporting RCTs published in professional journals, `rct_counts.xlsx`. Produces all tables and figures included in the article and the appendix, in the order in which they appear. To aid reproduction, each table or figure has its own subsection so that it can be located quickly in the script via the table of contents function.
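The run order implied above can be sketched as follows. This is only an illustration: it assumes that both scripts and all input files sit in the current working directory, which this README does not specify, and it skips any script it cannot find rather than failing.

```r
# Assumption: scripts and data files are in the current working directory.
# clean_rawdata.R regenerates forecasting_clean.csv, which paper.R then reads.
scripts <- c("clean_rawdata.R", "paper.R")

for (s in scripts) {
  if (file.exists(s)) {
    source(s, echo = FALSE)
  } else {
    message("Skipping missing script: ", s)
  }
}
```

Because `forecasting_clean.csv` is also listed among the datasets, running `paper.R` on its own should be enough if the cleaned file is already present.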

### Datasets

1. `ivr_mus.csv`

Contains the estimates from the pilot and main IVR paper authored by Golden, Gulzar, and Sonnet that are reported in Table 3. Because these figures serve only as point estimates and derive from another paper, they are presented here as is, that is, without the code that generated them in the original study that forecasters were asked to consider. We use only these mean values as comparison points for the forecasts.

In `ivr_mus.csv`:

* `comparison` -- The comparison of interest, i.e. the compliance metric
* `pilot_mu` -- The estimate for this comparison taken from the IVR pilot paper
* `intervention_mu` -- The estimate for this comparison taken from the main IVR intervention

`paper.R` reads in `ivr_mus.csv` and uses the data to produce part of Table 3.
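As a rough illustration of how these benchmark values can serve as comparison points, the sketch below sets mean forecasts against `intervention_mu`. It is not the code in `paper.R`. The column names follow this README, but the `comparison` labels, the numeric values, and the derived columns `mean_forecast` and `forecast_error` are hypothetical.

```r
# Toy stand-in for ivr_mus.csv: column names follow the README,
# the comparison labels and all values are made up for illustration.
ivr_mus <- data.frame(
  comparison      = c("ans_phone", "ans_quest"),
  pilot_mu        = c(0.40, 0.25),
  intervention_mu = c(0.30, 0.20),
  stringsAsFactors = FALSE
)

# Toy forecasts for the same two compliance metrics (hypothetical values).
forecasts <- data.frame(
  ans_phone = c(0.50, 0.35, 0.60),
  ans_quest = c(0.30, 0.10, 0.45)
)

# Mean forecast per metric, aligned with the rows of ivr_mus.
ivr_mus$mean_forecast <- vapply(
  ivr_mus$comparison,
  function(v) mean(forecasts[[v]], na.rm = TRUE),
  numeric(1),
  USE.NAMES = FALSE
)

# Positive values mean forecasters overestimated the realized estimate.
ivr_mus$forecast_error <- ivr_mus$mean_forecast - ivr_mus$intervention_mu

print(ivr_mus)
```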

2. `cerp_anon.csv`, `ucsd_anon.csv`, `us_anon.csv`

Contain the anonymized raw data from the forecasting sites. The R script `clean_rawdata.R` reads them in, then cleans and recodes them. The output of the merging and cleaning process is the replication dataset, `forecasting_clean.csv`.

3. `rct_counts.xlsx`

Contains, for each of the three journals *APSR*, *AJPS*, and *JOP*, abstracts of articles that were coded to report information about the publication of RCTs in these journals.

In `rct_counts.xlsx`:

* `year` -- The year in which the article was published
* `abstract` -- The complete text of the abstract of the article
* `rct` -- A dummy variable for whether the article reports results from an RCT (1) or some other type of study (0)
* `result` -- A dummy variable for whether the main treatment effect on the main outcome variable as hypothesized in the article is null (0) or not null (1)
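The coding above lends itself to simple summaries such as the share of RCTs per year and the share of null results among RCTs. The sketch below uses a toy data frame with the same column names. The values are invented, and coding `result` as `NA` for non-RCT articles is an assumption, not something this README states. Reading the real spreadsheet would additionally need an Excel reader such as `readxl::read_excel()`.

```r
# Toy stand-in for rct_counts.xlsx (abstract column omitted; values made up).
rct_counts <- data.frame(
  year   = c(2010, 2010, 2011, 2011, 2011),
  rct    = c(1, 0, 1, 1, 0),
  result = c(1, NA, 0, 1, NA)  # assumption: NA where the article is not an RCT
)

# Share of coded articles per year that report an RCT.
rct_share <- aggregate(rct ~ year, data = rct_counts, FUN = mean)

# Share of RCTs whose main hypothesized effect is null (result == 0).
rct_only   <- rct_counts[rct_counts$rct == 1, ]
null_share <- mean(rct_only$result == 0, na.rm = TRUE)

print(rct_share)
print(null_share)
```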

4. `forecasting_clean.csv`

Contains the merged and cleaned anonymized dataset produced by `clean_rawdata.R`.

In `forecasting_clean.csv`:

* `position` -- the professional status of the forecaster 
* `optimism` -- the degree of optimism about the potential for Information Technology (IT) to improve governance, measured on a scale from 1 to 5, with 1 being 'very pessimistic' and 5 being 'very optimistic'
* `familiarity` -- the degree of familiarity with the use of IT to improve governance, measured on a scale from 1 to 4, with 1 being 'very unfamiliar' and 4 being 'very familiar'
* `ans_phone` -- the forecasted compliance rate of the main IVR study for answering the IVR phone call
* `ans_quest` -- the forecasted compliance rate of the main IVR study for answering a question conditional on answering the IVR phone call
* `wave` -- the research site at which the forecasting experiment was conducted
* `call_account` -- the forecast ITT of the main IVR study for the effect of the call treatment on the accountability index
* `call_account_rev` -- the revised forecast ITT of the main IVR study for the effect of the call treatment on the accountability index
* `call_eval_gov` -- the forecast ITT of the main IVR study for the effect of the call treatment on the government evaluation index
* `call_eval_gov_rev` -- the revised forecast ITT of the main IVR study for the effect of the call treatment on the government evaluation index
* `call_eval_mpa` -- the forecast ITT of the main IVR study for the effect of the call treatment on the MPA evaluation index
* `call_eval_mpa_rev` -- the revised forecast ITT of the main IVR study for the effect of the call treatment on the MPA evaluation index
* `responsive_account` -- the forecast ITT of the main IVR study for the effect of the responsive treatment on the accountability index
* `responsive_account_rev` -- the revised forecast ITT of the main IVR study for the effect of the responsive treatment on the accountability index
* `responsive_eval_gov` -- the forecast ITT of the main IVR study for the effect of the responsive treatment on the government evaluation index
* `responsive_eval_gov_rev` -- the revised forecast ITT of the main IVR study for the effect of the responsive treatment on the government evaluation index
* `responsive_eval_mpa` -- the forecast ITT of the main IVR study for the effect of the responsive treatment on the MPA evaluation index
* `responsive_eval_mpa_rev` -- the revised forecast ITT of the main IVR study for the effect of the responsive treatment on the MPA evaluation index
* `pilot_first` -- a binary variable that is 1 if the respondent received the prime about the pilot study before making the ITT forecast and 0 if after the ITT forecast
* `compliance_first` -- a binary variable that is 1 if the respondent was asked to forecast compliance before making the ITT forecast and 0 if after the ITT forecast
* `optimism_group` -- a categorical variable that recodes as 'Optimistic' respondents who answer either 4 or 5 on the `optimism` variable, as 'Neutral' respondents who answer 3 on the `optimism` variable, and as 'Pessimistic' respondents who answer either 1 or 2 on the `optimism` variable
* `familiar` -- a binary variable that recodes as 1 respondents who answer either 3 or 4 on the `familiarity` variable and as 0 respondents who answer either 1 or 2 on the `familiarity` variable
* `call_avg` -- the average forecast ITT of the main IVR study for the effect of the call treatment across the accountability, MPA evaluation, and government evaluation indices
* `resp_avg` -- the average forecast ITT of the main IVR study for the effect of the responsive treatment across the accountability, MPA evaluation and government evaluation indices
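The derived variables at the end of this list follow directly from their definitions. The sketch below reconstructs them on toy data. It is not the code in `clean_rawdata.R`. All values are invented, and averaging the original rather than the revised forecasts is an assumption, since the definitions above do not say which set is used.

```r
# Toy data using the README's column names; every value is made up.
fc <- data.frame(
  optimism            = c(1, 3, 5, 4),
  familiarity         = c(2, 3, 4, 1),
  call_account        = c(0.10, 0.00, 0.20, 0.05),
  call_eval_gov       = c(0.20, 0.10, 0.30, 0.05),
  call_eval_mpa       = c(0.30, 0.20, 0.10, 0.05),
  responsive_account  = c(0.30, 0.00, 0.10, 0.20),
  responsive_eval_gov = c(0.30, 0.30, 0.10, 0.20),
  responsive_eval_mpa = c(0.30, 0.30, 0.10, 0.20)
)

# optimism_group: 1-2 Pessimistic, 3 Neutral, 4-5 Optimistic.
fc$optimism_group <- cut(
  fc$optimism,
  breaks = c(0, 2, 3, 5),
  labels = c("Pessimistic", "Neutral", "Optimistic")
)

# familiar: 1 if familiarity is 3 or 4, 0 if it is 1 or 2.
fc$familiar <- as.integer(fc$familiarity >= 3)

# Averages across the three outcome indices.
# Assumption: the original (not revised) forecasts are averaged.
fc$call_avg <- rowMeans(fc[, c("call_account", "call_eval_mpa", "call_eval_gov")])
fc$resp_avg <- rowMeans(
  fc[, c("responsive_account", "responsive_eval_mpa", "responsive_eval_gov")]
)

print(fc[, c("optimism_group", "familiar", "call_avg", "resp_avg")])
```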
